Frameworks, Implementation And Open Problems For The Collaborative Building Of A Multilingual Lexical Database

Authors

  • Mathieu Mangeot-Lerebours
  • Gilles Serasset
  • Frederic Andres
Abstract

Many NLP systems depend on lexical data, and the cost of developing that data is a major drawback of such systems. To cut these costs, we adopt a strategy inspired by open-source projects that allows volunteers to collaborate in creating a multilingual lexical database. To this end, we had to specify and develop tools for managing a lexical database whose information is complete and detailed enough to be usable across a wide range of applications. This paper presents our project and details the tools, frameworks and structures used to manage such a database. We also point out research problems that remain to be addressed in this context.

Similar resources

Building Multilingual Search Index using open source framework

This paper presents a comparison of open source search engine development frameworks in the context of their malleability for constructing a multilingual search index. The comparison study reveals that none of these frameworks is designed for this task. This paper elicits the challenges involved in building a multilingual index. We also discuss policy decisions and the implementation changes mad...

A Generic Collaborative Platform For Multilingual Lexical Database Development

The motivation of the Papillon project is to encourage the development of freely accessible Multilingual Lexical Resources by way of online collaborative work on the Internet. For this, we developed a generic community website originally dedicated to the diffusion and the development of a particular acception-based multilingual lexical database. The generic aspect of our platform allows its use...

The Automatic Mapping of Princeton WordNet Lexical-Conceptual Relations onto the Brazilian Portuguese WordNet Database

The Princeton WordNet (WN.Pr) lexical database has motivated efficient compilations of bulky relational lexicons since its inception in the 1980s. The EuroWordNet project, the first multilingual initiative built upon WN.Pr, opened up ways of building individual wordnets, and interrelating them by means of the so-called Inter-Lingual-Index, an unstructured list of the WN.Pr synsets. Other important...

Lexical Database for Multiple Languages: Multilingual Word Semantic Network

Data mining and knowledge engineering have become difficult tasks because of the large amount of data now available on the web. The validity and reliability of data have also become a main point of debate in knowledge acquisition. Besides, acquiring knowledge from different languages has become another concern. There are many language translators and corpora developed but the function of these translators an...

The Effects of Collaborative Versus Non-collaborative Massed and Distributed Presentation on the Comprehension and Production of Lexical Collocations

To investigate the effect of massed and distributed collaborative and non-collaborative presentation on L2 learners’ comprehension and production of lexical collocations, 105 participants at Takestan Islamic Azad University in 4 groups were assigned to four different treatment conditions (collaborative-massed; collaborative-distributed; noncollaborative-massed; and noncollaborative-distributed ...

Publication date: 2002